Abstract

In this paper, we investigate a conjecture by von Haeseler concerning the Maximum
Parsimony method for phylogenetic estimation, which was published by the Newton
Institute in Cambridge on a list of open phylogenetic problems in 2007. This conjecture
deals with the question whether Maximum Parsimony trees are hereditary. The conjecture
suggests that a Maximum Parsimony tree for a particular (DNA) alignment necessarily has
subtrees of all possible sizes which are most parsimonious for the corresponding
subalignments. We answer the conjecture affirmatively for binary alignments on five taxa
but also show how to construct examples for which Maximum Parsimony trees are not
hereditary. Apart from showing that a most parsimonious tree cannot generally be reduced
to a most parsimonious tree on fewer taxa, we also show that compatible most parsimonious
quartets do not have to provide a most parsimonious supertree. Last, we show that our
results can be generalized to Maximum Likelihood for certain nucleotide substitution models.

Keywords: phylogenetics, maximum parsimony, maximum likelihood, Jukes-Cantor 
model 
1 Introduction

Tree reconstruction methods for inferring phylogenetic trees are used to interpret the
ever-growing amount of available genetic sequence data. Unsurprisingly, such methods have
therefore been widely discussed in the last decades (e.g., [Felsenstein, 1978],
[Felsenstein, 2004], [Semple and Steel, 2003], [Yang]). One of the most frequently used tree
reconstruction methods is the so-called Fitch parsimony [Fitch, 1971] or Maximum Parsimony
method (MP). Two of the reasons for its popularity are its simplicity compared
to other methods such as Maximum Likelihood as well as its purely combinatorial basic 
principle. The latter makes MP a method that can be applied to any data alignment 
without any assumptions on the way the data has been generated, which means for the 
DNA that no assumptions on the probability of a nucleotide substitution have to be made 
(which is why MP is often said to be 'model-free'). Despite this simplicity, not all aspects
of MP are understood to date. One of the questions that remained unsolved for quite
some time is whether MP trees are hereditary, i.e. whether for an MP tree of an alignment
on $m$ taxa we can find a subtree of this tree of size $k$ (for all $k = 4, \ldots, m-1$) which
is most parsimonious for the corresponding subalignment. This problem was submitted
by Arndt von Haeseler to the Isaac Newton Institute's list of open phylogenetic problems
in 2007 (see http://www.newton.ac.uk/programmes/PLG/conj.pdf) as well as to
the 'Penny Ante' list of the Annual New Zealand Phylogenetics Meeting in Kaikoura in 
2009 (see http://www.math.canterbury.ac.nz/bio/events/kaikoura09/penny.shtml). The 
importance of the conjecture is manifold. Biologically, MP trees with no MP subtrees 
seem quite counterintuitive as one would expect the MP tree to be related to MP trees 
on fewer taxa. Particularly when outgroups are included in a DNA analysis, one would 
want the topology of the rest of the tree to be independent of the outgroup, so the topol- 
ogy of the 'best' tree should be independent of the presence or absence of the outgroup. 
Moreover, if the conjecture was true, there would be sequences of MP trees, starting from 
four taxa and growing one new leaf at a time, leading to each MP tree of the whole align- 
ment under consideration, so the big trees would 'inherit' their MP property from smaller 
trees. Mathematically, such a property would be particularly interesting with regard
to inductive proofs or dynamic programming. However, we show in this paper that the
conjecture is not true in general, but does hold in some special cases, such as when
the alignment is homoplasy-free or when the alignment is binary and there are only five
taxa.

While the above mentioned aspects of heredity basically refer to reducing large MP 
trees to smaller ones, we also consider the opposite scenario: we show that even if an 
alignment has only unique MP quartet trees for all 4-taxa subalignments and even if 
all these quartets are compatible with one another, the supertree comprising all these 
quartets is not necessarily an MP tree for the original alignment. This means that MP 
quartets cannot generally be combined into larger MP trees. 

Last, we investigate the impact these findings concerning MP have on Maximum Likelihood
(ML) under the (generalized) Jukes-Cantor model (also known as the $N_r$-model). In this
analysis, we use the strong relationship of MP and ML as described in [Tuffley and Steel,
1997] and conclude that the cases that are problematic for MP also turn out to be
problematic for ML under the $N_r$-model, even if there is a common mechanism of site
evolution.



2 Notation and Model Assumptions 

Recall that an unrooted binary phylogenetic $X$-tree is a tree $T = (V(T), E(T))$ on a leaf
set $X = \{1, \ldots, m\} \subset V(T)$ with only vertices of degree 1 (leaves) or 3 (internal vertices).
In this paper, when there is no ambiguity we often just write 'tree' when referring to an
unrooted binary phylogenetic $X$-tree. Furthermore, recall that a character $f$ is a function
$f : X \to C$ for some set $C := \{c_1, c_2, c_3, \ldots, c_r\}$ of $r$ character states ($r \in \mathbb{N}$). An extension
of $f$ to $V(T)$ is a map $g : V(T) \to C$ such that $g(i) = f(i)$ for all $i$ in $X$. For such an
extension $g$ of $f$, we denote by $l_T(g)$ the number of edges $e = \{u, v\}$ in $T$ on which a
substitution occurs, i.e. where $g(u) \neq g(v)$. The parsimony score of $f$ on $T$, denoted by
$l_T(f)$, is obtained by minimizing $l_T(g)$ over all possible extensions $g$. The parsimony score
of a sequence of characters $S := f_1 f_2 \cdots f_n$ is given by $l_T(S) = \sum_{i=1}^{n} l_T(f_i)$. Note that $S$
can be viewed not only columnwise as a sequence of characters, but also rowwise as aligned
(DNA) species data. In this paper, we therefore use the terms 'sequence of characters'
and 'alignment' synonymously when there is no ambiguity. Moreover, we denote by $f - k$,
$S - k$ and $T - k$ the restriction of $f$, $S$ and $T$, respectively, to the set $X - k$; so $k \in X$
is the taxon that is present in $f$, $S$ and $T$ but not in $f - k$, $S - k$ and $T - k$.
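The parsimony score defined above can be computed without enumerating extensions. The following is a minimal sketch using the standard Fitch bottom-up pass; the nested-tuple tree encoding and the names `fitch_score` and `alignment_score` are illustrative assumptions, not notation from this paper. The unrooted tree is rooted at its degree-3 vertex, which does not affect the score.

```python
def fitch_score(tree, character):
    """Parsimony score l_T(f) of one character on an unrooted binary tree.

    Assumed encoding (hypothetical, for illustration): a tree is a nested
    tuple of 1-based taxon labels, e.g. ((1, 2), 3, (4, 5)); a character is
    a string whose i-th letter is the state of taxon i.
    """
    def visit(node):
        if isinstance(node, int):                  # leaf: its state is fixed
            return {character[node - 1]}, 0
        state_set, cost = visit(node[0])
        for child in node[1:]:                     # fold the children pairwise
            child_set, child_cost = visit(child)
            cost += child_cost
            common = state_set & child_set
            if common:
                state_set = common                 # agreement: no substitution
            else:
                state_set = state_set | child_set  # disagreement: one substitution
                cost += 1
        return state_set, cost

    return visit(tree)[1]


def alignment_score(tree, characters):
    """l_T(S): the sum of the scores of the characters f_1, ..., f_n."""
    return sum(fitch_score(tree, f) for f in characters)
```

For instance, `fitch_score(((1, 2), 3, (4, 5)), "AACCC")` needs a single substitution, on the edge separating taxa 1 and 2 from the rest.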

A character $f$ is said to be homoplasy-free on a tree $T$ if $l_T(f) = |f| - 1$, where $|f|$
denotes the number of character states employed by $f$. A sequence $S$ of characters is
called homoplasy-free when all its characters have that property. Note that if a character
or an alignment is homoplasy-free on a certain tree, this tree minimizes its parsimony
score and is therefore most parsimonious for this character or alignment, respectively.

Recall that a character $f$ on a leaf set $X$ is said to be informative (with respect to
parsimony) if at least two distinct character states occur more than once on $X$. Otherwise
$f$ is called non-informative. Note that for a non-informative character $f$, $l_{T_i}(f) = l_{T_j}(f)$
for all trees $T_i$, $T_j$ on the same set $X$ of leaves. In this paper, we always refer to a
character with its underlying taxon clustering pattern in mind, i.e. we do not
distinguish between, for instance, AACC, CCAA and CCGG.
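The two conventions above (identifying characters by their clustering pattern, and the informativeness criterion) can be made concrete as follows. This is a small sketch; the function names and the tuple encoding of a pattern are assumptions made for illustration.

```python
from collections import Counter


def canonical_pattern(character):
    """Relabel states in order of first appearance, so that characters with
    the same taxon clustering (e.g. AACC, CCAA, CCGG) get the same pattern."""
    relabel = {}
    # setdefault assigns the next unused integer to each newly seen state
    return tuple(relabel.setdefault(state, len(relabel)) for state in character)


def is_informative(character):
    """True if at least two distinct states each occur more than once."""
    counts = Counter(character)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def pattern_counts(characters):
    """Number of occurrences of each clustering pattern in an alignment,
    given as a list of characters (columns)."""
    return Counter(canonical_pattern(f) for f in characters)
```

Under this encoding, `canonical_pattern("AACC")`, `canonical_pattern("CCAA")` and `canonical_pattern("CCGG")` all give `(0, 0, 1, 1)`.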



Next we describe the fully symmetric $r$-state model [Neyman, 1971], also known as
the $N_r$-model, which underlies the Tuffley and Steel equivalence result concerning MP
and ML [Tuffley and Steel, 1997].

Consider a phylogenetic $X$-tree $T$ arbitrarily rooted at one of its vertices. The $N_r$-model
assumes that a state is assigned to the root from the uniform distribution on
the set of states. The state then evolves away from the root as follows. The model
assumes equal rates of substitutions between any two distinct character states. For any
edge $e = \{u, v\} \in E(T)$, where $u$ is the vertex closer to the root, let $p_e$ denote the
conditional probability $P(v = c_i \mid u = c_j)$, where $c_i \neq c_j$. The probability $p_e$ is equal for
all pairs of distinct states $c_i$ and $c_j$. Therefore, the probability that a substitution (of $c_j$
by a state different from $c_j$) occurs on the edge $e$ is $(r-1)p_e$. Let $q_e$ be the conditional
probability $P(v = c_i \mid u = c_i)$, i.e. the probability that no substitution occurs on edge
$e$. In the $N_r$-model, we have $0 \le p_e \le \frac{1}{r}$ for all $e \in E(T)$, and $(r-1)p_e + q_e = 1$.
Moreover, the $N_r$-model assumes that substitutions on different edges are independent.
Note that for $r = 4$, the $N_r$-model coincides with the well-known Jukes-Cantor model
[Jukes and Cantor, 1969].



Let $T$ be a phylogenetic $X$-tree and let $f$ be a character on its leaf set $X$. Let the
substitution probabilities assigned to the edges of $T$ under the $N_r$-model be collectively
denoted by $p := (p_e : e \in E(T))$. Then we denote by $P(f \mid T, p)$ the probability of
observing character $f$ given tree $T$ and the parameter values $p$. Note that $P(f \mid T, p)$ does
not depend on the root position since the model is symmetric. The maximum value of this
probability for fixed $f$ and $T$ as $p$ ranges over all possibilities is denoted by $\max P(f \mid T)$,
i.e. $\max P(f \mid T) := \max_{p} P(f \mid T, p)$.
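For small trees, $P(f \mid T, p)$ can be evaluated directly from the model description by summing over all assignments of states to the internal vertices. The sketch below does exactly that; the edge-list encoding, the vertex numbering and the function name `character_probability` are assumptions chosen for illustration, and the brute-force sum is only practical for a handful of internal vertices.

```python
from itertools import product


def character_probability(edges, leaf_states, num_internal, p, r):
    """P(f | T, p) under the N_r model, by brute-force summation.

    Assumed encoding (hypothetical): leaves are vertices 0..m-1, internal
    vertices are m..m+num_internal-1; `edges` is a list of (u, v) pairs;
    `leaf_states` holds one state in range(r) per leaf; `p` maps each edge
    to its substitution probability p_e, with 0 <= p_e <= 1/r.
    """
    total = 0.0
    for internal in product(range(r), repeat=num_internal):
        states = tuple(leaf_states) + internal
        prob = 1.0 / r                      # uniform distribution at the root
        for (u, v) in edges:
            p_e = p[(u, v)]
            q_e = 1.0 - (r - 1) * p_e       # probability of no substitution
            prob *= q_e if states[u] == states[v] else p_e
        total += prob
    return total
```

Because the model is symmetric, no root needs to be singled out in the edge list: the factor $1/r$ accounts for the state of whichever vertex is taken as the root. As a sanity check, the probabilities of all $r^m$ characters on a fixed tree sum to 1.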

Now let $S := f_1 \ldots f_n$ be a sequence of characters. When we refer to a sequence of
characters under the $N_r$-model with no common mechanism, this means that the substitution
probabilities on edges may be different for different characters in $S$ without any correlation
between the characters. We suppose that for each character $f_i$ in the sequence and for each
edge $e$ of the tree, there is a parameter $p_{e,i}$ that gives the substitution probability for $f_i$
on edge $e$. When there is no common mechanism, the parameters $p_{e,i}$ are all independent.
For $i = 1, \ldots, n$, let $p_i := (p_{e,i} : e \in E(T))$ be the vectors of substitution probabilities. We
denote the model parameters $(p_i, i = 1, \ldots, n)$ collectively as $\theta$ and refer to $P(S \mid T, \theta)$ as
the probability of observing sequence $S$ given the phylogenetic tree $T$ and model
parameters $\theta$. We then define the likelihood of the tree $T$ and the model parameters $\theta$ given
the sequence $S$, which we refer to as the likelihood function, as $L(T, \theta \mid S) := P(S \mid T, \theta)$.
The maximum likelihood method of phylogenetic tree reconstruction involves optimizing
the likelihood function in two steps as described in [Semple and Steel, 2003]. We first
maximize $P(S \mid T, \theta)$ over the space of model parameters $\theta$. We define:

$$\max P(S \mid T) := \max_{\theta} P(S \mid T, \theta).$$

We then choose a tree $T$ that maximizes $\max P(S \mid T)$. We call such a tree a maximum
likelihood tree (ML-tree) of $S$. Thus, an ML-tree of a sequence $S$ is $\operatorname{argmax}_T (\max P(S \mid T))$.
Note that under the assumption of no common mechanism, i.e. when the characters in an
alignment are regarded as independent of one another, we have:

$$\max P(S \mid T) = \prod_{i=1}^{n} \max_{p_i} P(f_i \mid T, p_i).$$

3 Results 

3.1 Heredity Part I: Inferring small MP trees from larger ones 

As explained in Section 1, we analyze cases in which MP trees are or are not hereditary.
In particular, we examine whether a most parsimonious tree on an alignment needs to 
be related to most parsimonious trees on subalignments. In fact, the conjecture under 
investigation suggests a sequence of MP trees of sizes leading from the number of taxa 
considered down to four, where small MP trees are subtrees of the larger ones. Note that 
as there is only one unrooted tree on one, two and three taxa, respectively, the conjecture 
does not consider these cases as subtrees of these sizes are unique and therefore always 
most parsimonious. 

We now formulate the conjecture mathematically. 

Conjecture 1 (Conjecture PCS from the Isaac Newton Institute's 'Phylogenetics: Challenges
and Conjectures' list 2007). Let $S := f_1 f_2 \cdots f_n$ be a sequence of characters ('alignment')
on the set $X$ of taxa, where $|X| = m$, and let $T$ be a Maximum Parsimony tree
for $S$. Then, for each $k = 4, \ldots, m-1$, there exists a subset $Y$ of $X$ of size $k$ so that $T|_Y$
is an MP tree for $S|_Y$ (where $S|_Y$ is the sequence $S$ of characters restricted to the taxa in
$Y$ and $T|_Y$ is the tree $T$ restricted to the taxa in $Y$).



While finding an MP tree is generally NP-hard [Foulds and Graham, 1982], if this
conjecture were true it might be relevant for dynamic programming approaches concerning
certain instances of parsimony. Note that the conjecture does not state which particular
subtree would be most parsimonious, so the conjecture is not in conflict with the
NP-hardness of Maximum Parsimony and therefore could be valid. Moreover, mathematically
a statement like that given in the conjecture would be useful to investigate theoretical 
properties of MP using inductive proofs, as the inductive step in such proofs requires 
knowledge on smaller instances of the problem under investigation. 

In the following, we will present two special cases of the conjecture, namely the case 
in which the given alignment is homoplasy-free as well as the case where the alignment 
is on five taxa and employs only binary characters. In these cases, the conjecture is true. 
Moreover, we afterwards analyze more general cases where the conjecture fails. 

We need the following lemma in order to prove a first positive result concerning
Conjecture 1.

Lemma 3.1. Let $T$ be an unrooted binary phylogenetic $X$-tree for a taxon set $X$ with
$|X| = m$ and let $f$ be a homoplasy-free character on $T$. Let $k \in X$ be a taxon. Then,
$f - k$ is homoplasy-free on $T - k$.

Proof. By definition of homoplasy, $l_T(f) = |f| - 1$. For any taxon $k \in X$, note that the
parsimony score of the character $f - k$ on $T - k$ can be calculated as follows:

$$
l_{T-k}(f - k) =
\begin{cases}
l_T(f) = |f| - 1 & \text{if the character state } c_k \text{ of taxon } k \text{ is not unique in } f, \\
l_T(f) - 1 = |f| - 2 & \text{else,}
\end{cases}
\;=\;
\begin{cases}
|f - k| - 1 & \text{if } |f - k| = |f|, \\
|f - k| - 1 & \text{if } |f - k| = |f| - 1.
\end{cases}
$$

So altogether $l_{T-k}(f - k) = |f - k| - 1$. Thus, $f - k$ is homoplasy-free on $T - k$. □

Now we are in the position to state the first heredity result.

Theorem 3.2. Conjecture 1 is true if $S$ is homoplasy-free.

Proof. Let $S$ be a homoplasy-free alignment with MP tree $T$ and taxon set $X = \{1, \ldots, m\}$.
Then, by Lemma 3.1, for any taxon $k \in X$ we conclude that the restriction $S - k$ of $S$
to $X - k$ is homoplasy-free on the corresponding restriction $T - k$ of $T$. As explained in
Section 2, homoplasy-free alignments are parsimoniously best possible, i.e. because $S - k$
is homoplasy-free on $T - k$, $T - k$ is an MP tree for $S - k$. We repeat this argument to
derive the desired sequence of MP trees from $m - 1$ taxa down to 4 taxa. This completes
the proof. □

An example for heredity of homoplasy-free alignments is depicted in Figure 1.

[Figure 1 (drawing not recoverable): trees $T$, $T - 6$ and $T - \{5,6\}$ next to the alignments
$S$ (rows AAA, AAA, CAA, CCA, CCC, CCC), $S - 6$ (rows AAA, AAA, CAA, CCA, CCC) and
$S - \{5,6\}$ (rows AAA, AAA, CAA, CCA).]

Figure 1: Illustration of Theorem 3.2. Alignment $S$ is homoplasy-free on tree $T$, so all
subalignments are homoplasy-free on the corresponding subtrees.

In order to investigate Conjecture 1 for general alignments, we now describe the idea
underlying the following results.

Main Idea. If for $m$ taxa there exist $p$ distinct characters (or, more precisely, character
patterns, cf. Section 2) $f_1, \ldots, f_p$, the parsimony score of an alignment $S$ on an $m$-taxa tree
$T$ can be expressed as $\sum_{i=1}^{p} x_i l_T(f_i)$, where $x_i$ denotes the number of times the character $f_i$
occurs in $S$ (note that this implies $|S| = \sum_{i=1}^{p} x_i$). So the fact that a tree $T$ is parsimoniously
better than another tree $\tilde{T}$ concerning some alignment $S$ can be expressed in terms of
the inequality $\sum_{i=1}^{p} x_i l_T(f_i) < \sum_{i=1}^{p} x_i l_{\tilde{T}}(f_i)$. The same can be done for subalignments and the
corresponding subtrees, so that altogether Conjecture 1 leads to a system of inequalities
that need to be fulfilled by a potential counterexample. Such systems can then be tackled
with the help of computer algebra systems.

We now use the idea explained above to prove the following statement on five taxa. 

Theorem 3.3. Conjecture 1 is true in the case where $|X| = m = 5$ and $S = f_1 \ldots f_n$ is
binary, i.e. $f_1, \ldots, f_n$ are 2-state characters. In particular, if a tree $T = ((1,2),3,(4,5))$
as depicted in Figure 2 is an MP tree for such an alignment $S$, then the tree $T - 3 =
((1,2),(4,5))$ as depicted in Figure 3 is an MP tree for the alignment $S - 3$, which results
from $S$ when taxon 3 is deleted.



Figure 2: Tree $T = ((1,2),3,(4,5))$, which is an MP tree for some given alignment $S$.

Figure 3: If $T = ((1,2),3,(4,5))$ is an MP tree for some given alignment $S$, tree
$T - 3 = ((1,2),(4,5))$ is an MP tree for $S - 3$.

Proof. Let $f_1 := AACCC$, $f_2 := ACACC$, $f_3 := ACCAC$, $f_4 := ACCCA$, $f_5 := ACCAA$,
$f_6 := ACACA$, $f_7 := ACAAC$, $f_8 := AACCA$, $f_9 := AACAC$, $f_{10} := AAACC$
be the ten parsimoniously informative characters on five taxa. Let $S$ be a binary alignment
on five taxa. Without loss of generality, we assume that tree $T = ((1,2),3,(4,5))$ as
depicted in Figure 2 is most parsimonious for $S$ (otherwise we re-label the leaves). This
particularly implies that

$$l_T(S) \le l_{\tilde{T}}(S) \quad \text{as well as} \quad l_T(S) \le l_{\hat{T}}(S), \tag{1}$$

where $\tilde{T} = ((1,4),3,(2,5))$ and $\hat{T} = ((1,5),3,(2,4))$ are the trees depicted in Figure 4.
Note that we may ignore non-informative characters as they have the same score on all
trees. Therefore, we can think of $S$ as a combination of characters $f_1, \ldots, f_{10}$, which occur
$x_1, \ldots, x_{10}$ times in $S$, respectively. We then rewrite Inequality (1) as follows:

$$\sum_{i=1}^{10} x_i l_T(f_i) \le \sum_{i=1}^{10} x_i l_{\tilde{T}}(f_i) \quad \text{and} \quad \sum_{i=1}^{10} x_i l_T(f_i) \le \sum_{i=1}^{10} x_i l_{\hat{T}}(f_i). \tag{2}$$

Calculating the parsimony scores for $f_1, \ldots, f_{10}$ on trees $T$, $\tilde{T}$ and $\hat{T}$, respectively, we get
$l_T(f_1) = l_T(f_{10}) = l_{\tilde{T}}(f_3) = l_{\tilde{T}}(f_7) = l_{\hat{T}}(f_4) = l_{\hat{T}}(f_6) = 1$, and all other parsimony scores
are equal to 2. We now rewrite Inequality (2) using these scores:
$$
\begin{aligned}
& x_1 + 2x_2 + 2x_3 + 2x_4 + 2x_5 + 2x_6 + 2x_7 + 2x_8 + 2x_9 + x_{10} \\
& \quad \le 2x_1 + 2x_2 + x_3 + 2x_4 + 2x_5 + 2x_6 + x_7 + 2x_8 + 2x_9 + 2x_{10} \\
& \Leftrightarrow \; x_3 + x_7 \le x_1 + x_{10}
\end{aligned}
\tag{3}
$$

and

$$
\begin{aligned}
& x_1 + 2x_2 + 2x_3 + 2x_4 + 2x_5 + 2x_6 + 2x_7 + 2x_8 + 2x_9 + x_{10} \\
& \quad \le 2x_1 + 2x_2 + 2x_3 + x_4 + 2x_5 + x_6 + 2x_7 + 2x_8 + 2x_9 + 2x_{10} \\
& \Leftrightarrow \; x_4 + x_6 \le x_1 + x_{10}
\end{aligned}
\tag{4}
$$




Figure 4: If tree $T$ depicted in Figure 2 is an MP tree for an alignment $S$, the trees
$\tilde{T} = ((1,4),3,(2,5))$ and $\hat{T} = ((1,5),3,(2,4))$ cannot have better parsimony scores for $S$
than $T$, which leads to Inequality (1).



Now we assume that the subtree $T - 3 = ((1,2),(4,5))$ of $T$ is not most parsimonious.
This implies that at least one of the two alternative trees, namely $\tilde{T} - 3 = ((1,4),(2,5))$ or
$\hat{T} - 3 = ((1,5),(2,4))$, must be strictly better than $T - 3$ in the sense of parsimony. Using
the above argument, we get

$$l_{T-3}(S - 3) > l_{\tilde{T}-3}(S - 3) \quad \text{or} \quad l_{T-3}(S - 3) > l_{\hat{T}-3}(S - 3). \tag{5}$$

Calculating the parsimony scores of $f_1 - 3, \ldots, f_{10} - 3$ on trees $T - 3$, $\tilde{T} - 3$ and $\hat{T} - 3$,
respectively, we get $l_{T-3}(f_3 - 3) = l_{T-3}(f_4 - 3) = l_{T-3}(f_6 - 3) = l_{T-3}(f_7 - 3) =
l_{\tilde{T}-3}(f_1 - 3) = l_{\tilde{T}-3}(f_4 - 3) = l_{\tilde{T}-3}(f_6 - 3) = l_{\tilde{T}-3}(f_{10} - 3) = l_{\hat{T}-3}(f_1 - 3) = l_{\hat{T}-3}(f_3 - 3) =
l_{\hat{T}-3}(f_7 - 3) = l_{\hat{T}-3}(f_{10} - 3) = 2$, and all other parsimony scores are equal to 1. We now
rewrite Inequality (5) using these scores:

$$
\begin{aligned}
& x_1 + x_2 + 2x_3 + 2x_4 + x_5 + 2x_6 + 2x_7 + x_8 + x_9 + x_{10} \\
& \quad > 2x_1 + x_2 + x_3 + 2x_4 + x_5 + 2x_6 + x_7 + x_8 + x_9 + 2x_{10} \\
& \Leftrightarrow \; x_3 + x_7 > x_1 + x_{10}
\end{aligned}
\tag{6}
$$

or

$$
\begin{aligned}
& x_1 + x_2 + 2x_3 + 2x_4 + x_5 + 2x_6 + 2x_7 + x_8 + x_9 + x_{10} \\
& \quad > 2x_1 + x_2 + 2x_3 + x_4 + x_5 + x_6 + 2x_7 + x_8 + x_9 + 2x_{10} \\
& \Leftrightarrow \; x_4 + x_6 > x_1 + x_{10}
\end{aligned}
\tag{7}
$$

As either Inequality (6) or (7) must hold, this contradicts either (3) or (4). Therefore,
$T - 3$ is an MP tree for $S - 3$. □
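The statement of Theorem 3.3 can also be sanity-checked numerically for small multiplicities. The sketch below is not a substitute for the proof: it only enumerates multiplicity vectors with entries up to a chosen bound. It relies on the fact that an informative binary character on five taxa has score 1 on a tree exactly when its two-taxon side is a cherry of that tree, and score 2 otherwise. All identifiers (`PAIRS`, `TREES`, `theorem_holds`, ...) are illustrative assumptions.

```python
from itertools import product

# The ten informative binary characters on five taxa, in the order of the proof.
CHARACTERS = ["AACCC", "ACACC", "ACCAC", "ACCCA", "ACCAA",
              "ACACA", "ACAAC", "AACCA", "AACAC", "AAACC"]


def minority_pair(character):
    """Taxa (1-based) carrying the state that occurs exactly twice."""
    rare = min(set(character), key=character.count)
    return frozenset(i for i, s in enumerate(character, start=1) if s == rare)


PAIRS = [minority_pair(f) for f in CHARACTERS]


def pairings(taxa):
    """The three ways to split four taxa into two pairs."""
    a, b, c, d = sorted(taxa)
    return [frozenset({frozenset({a, x}), frozenset({b, c, d} - {x})})
            for x in (b, c, d)]


# A five-taxon tree ((a,b),c,(d,e)) is identified by its two cherries.
TREES = [t for middle in range(1, 6) for t in pairings(set(range(1, 6)) - {middle})]
QUARTETS = pairings({1, 2, 4, 5})            # the three trees on taxa 1, 2, 4, 5

T = frozenset({frozenset({1, 2}), frozenset({4, 5})})          # ((1,2),3,(4,5))
T_MINUS_3 = frozenset({frozenset({1, 2}), frozenset({4, 5})})  # ((1,2),(4,5))


def score_on_tree(tree, counts):
    return sum(x * (1 if pair in tree else 2) for x, pair in zip(counts, PAIRS))


def score_on_quartet(quartet, counts):
    total = 0
    for x, pair in zip(counts, PAIRS):
        if 3 in pair:
            total += x          # non-informative once taxon 3 is deleted
        else:
            total += x * (1 if pair in quartet else 2)
    return total


def theorem_holds(max_copies):
    """Check: whenever T is an MP tree, T - 3 is an MP tree for S - 3."""
    for counts in product(range(max_copies + 1), repeat=len(CHARACTERS)):
        if score_on_tree(T, counts) == min(score_on_tree(t, counts) for t in TREES):
            best = min(score_on_quartet(q, counts) for q in QUARTETS)
            if score_on_quartet(T_MINUS_3, counts) != best:
                return False
    return True
```

Calling `theorem_holds(1)` covers all 1024 alignments in which each informative character occurs at most once; larger bounds grow as `(max_copies + 1) ** 10`.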



Next we show that the result presented in Theorem 3.3 cannot be generalized to $r$-state
characters for $r > 2$. In fact, it is not only possible that the particular subtree
$((1,2),(4,5))$ of a most parsimonious tree $((1,2),3,(4,5))$ is not most parsimonious for the
corresponding subalignment; it is even possible that the most parsimonious tree does not
have any most parsimonious 4-taxa subtree at all.

Proposition 3.4. Conjecture 1 is not generally true for multistate characters, even if the
tree under consideration is the only MP tree.

Proof. We construct an explicit example employing three character states. Let $f_1, \ldots, f_{10}$
be defined as in the proof of Theorem 3.3. Additionally, we define $f_{11} := AACCT$,
$f_{12} := AACTC$, $f_{13} := AATCC$, $f_{14} := ACACT$, $f_{15} := ACATC$, $f_{16} := ATACC$,
$f_{17} := ACCAT$, $f_{18} := ACTAC$, $f_{19} := ATCAC$, $f_{20} := ACCTA$, $f_{21} := ACTCA$,
$f_{22} := ATCCA$, $f_{23} := TAACC$, $f_{24} := TACAC$ and $f_{25} := TACCA$. Note that there
is no parsimoniously informative 5-taxa character employing more than three states, so
$f_1, \ldots, f_{25}$ is the complete list of parsimoniously informative characters on five taxa. Now
if a tree $T = ((1,2),3,(4,5))$ shall be the unique MP tree for an alignment $S$ and none
of the 4-taxa subtrees of $T$, i.e. $T - 1 = ((2,3),(4,5))$, $T - 2 = ((1,3),(4,5))$, $T - 3 =
((1,2),(4,5))$, $T - 4 = ((1,2),(3,5))$ and $T - 5 = ((1,2),(3,4))$, shall be most parsimonious for
the corresponding subalignments of $S$, this can be expressed with the help of the following
system of inequalities (as in the proof of Theorem 3.3, we can ignore non-informative
characters without loss of generality):

$$\text{for all } \tilde{T} \neq T \text{ we have } \sum_{i=1}^{25} x_i l_T(f_i) < \sum_{i=1}^{25} x_i l_{\tilde{T}}(f_i),$$

and for each $j = 1, \ldots, 5$ there exists some tree $\tilde{T}_j \neq T$ such that

$$\sum_{i=1}^{25} x_i l_{T-j}(f_i - j) > \sum_{i=1}^{25} x_i l_{\tilde{T}_j - j}(f_i - j).$$

Using a computer algebra system, we find that one possible solution is $x_4 = 1$, $x_6 = 2$,
$x_{11} = 1$, $x_{13} = 2$, $x_{23} = 1$ and all other $x_i = 0$. This gives the following alignment:

S :=
AAAAAAA
CCCAAAC
CAACCCC
CCCCTTT
AAATTTT

So $S$ is an alignment with unique MP tree $T = ((1,2),3,(4,5))$ and no most parsimonious
4-taxa subtree, which can be verified by examining all five distinct characters employed
by $S$ on all 15 trees on five taxa and their corresponding 4-taxa subtrees. □
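The verification suggested at the end of the proof is small enough to script. The following sketch scores the alignment above on all 15 five-taxon trees and compares each 4-taxa subtree of $T$ with the three possible quartet trees on the same taxa. It uses a Fitch pass for scoring; the nested-tuple tree encoding and all function names are assumptions made for illustration, not part of the original construction (which used a computer algebra system).

```python
# Rows of the alignment S from the proof, one row per taxon 1..5.
ROWS = ["AAAAAAA", "CCCAAAC", "CAACCCC", "CCCCTTT", "AAATTTT"]


def fitch(tree, states):
    """Fitch parsimony score of one character; `states` maps taxon -> state."""
    def visit(node):
        if isinstance(node, int):
            return {states[node]}, 0
        current, cost = visit(node[0])
        for child in node[1:]:
            child_set, child_cost = visit(child)
            cost += child_cost
            if current & child_set:
                current = current & child_set
            else:
                current = current | child_set
                cost += 1
        return current, cost
    return visit(tree)[1]


def score(tree, rows):
    """Score of the alignment on a tree; taxa absent from the tree are ignored."""
    return sum(fitch(tree, dict(enumerate(col, start=1))) for col in zip(*rows))


def quartets(taxa):
    """The three unrooted binary trees on four taxa."""
    a, b, c, d = sorted(taxa)
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def five_taxon_trees(taxa):
    """All 15 unrooted binary trees on five taxa, written as ((a,b),c,(d,e))."""
    return [(left, middle, right)
            for middle in sorted(taxa)
            for left, right in quartets(set(taxa) - {middle})]


def restrict(tree, taxon):
    """Delete one leaf and suppress the resulting degree-2 vertex."""
    if isinstance(tree, int):
        return None if tree == taxon else tree
    kept = [r for r in (restrict(child, taxon) for child in tree) if r is not None]
    return kept[0] if len(kept) == 1 else tuple(kept)


def mp_trees(rows, taxa):
    """Best score and the list of trees attaining it."""
    scores = {t: score(t, rows) for t in five_taxon_trees(taxa)}
    best = min(scores.values())
    return best, [t for t, s in scores.items() if s == best]


def non_mp_subtrees(tree, rows, taxa):
    """Taxa k for which tree - k is strictly worse than some quartet on S - k."""
    return [k for k in sorted(taxa)
            if score(restrict(tree, k), rows)
            > min(score(q, rows) for q in quartets(set(taxa) - {k}))]
```

With `taxa = {1, 2, 3, 4, 5}`, `mp_trees(ROWS, taxa)` returns the score 14 together with the single tree `((1, 2), 3, (4, 5))`, and `non_mp_subtrees(((1, 2), 3, (4, 5)), ROWS, taxa)` lists all five taxa.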

So for five taxa, the question whether or not Conjecture [1] holds depends on the number 
of character states the alignment employs: for two character states it holds, whereas it 
fails for three or more states. Next we show that this distinction cannot be generalized to 
more than five taxa. In fact, we use our approach of solving inequality systems in order 
to generate an alignment S on six taxa with unique MP tree T = (((12), 3), (4, (5,6))), 
which has neither a most parsimonious 5-taxa subtree nor a most parsimonious 4-taxa 
subtree for the corresponding subalignments of S. 
Proposition 3.5. Conjecture 1 is not generally true for more than five taxa, even if only
binary characters are employed and the tree under consideration is the only MP tree. In
fact, a unique MP tree might not have any (non-trivial) most parsimonious subtree at all.

Proof. Consider all informative binary characters on six taxa: $f_1 = AACCCC$, $f_2 =
ACACCC$, $f_3 = ACCACC$, $f_4 = ACCCAC$, $f_5 = ACCCCA$, $f_6 = ACCAAA$, $f_7 =
ACACAA$, $f_8 = ACAACA$, $f_9 = ACAAAC$, $f_{10} = AACCAA$, $f_{11} = AACACA$, $f_{12} =
AACAAC$, $f_{13} = AAACCA$, $f_{14} = AAACAC$, $f_{15} = AAAACC$, $f_{16} = AAACCC$,
$f_{17} = AACACC$, $f_{18} = AACCAC$, $f_{19} = AACCCA$, $f_{20} = ACAACC$, $f_{21} = ACACAC$,
$f_{22} = ACACCA$, $f_{23} = ACCAAC$, $f_{24} = ACCACA$ and $f_{25} = ACCCAA$. Now we
construct an example analogously to the construction shown in the proof of Proposition
3.4 and find that the alignment $S$ employing two copies of $f_3$, five copies of $f_4$, four copies
of $f_5$, one copy of $f_7$, nine copies of $f_8$, six copies of $f_9$, eleven copies of $f_{11}$, nine copies of
$f_{12}$, three copies of $f_{13}$, two copies of $f_{14}$, seven copies of $f_{16}$, one copy of $f_{19}$, four copies of
$f_{20}$ and six copies of $f_{25}$ has the desired properties. This alignment is depicted in Figure
5. It has a unique MP tree, namely $T = (((1,2),3),(4,(5,6)))$ as depicted in Figure
6. The 5-taxa and 4-taxa MP trees for $S$ are depicted in Figures 7 and 8, respectively.
If the reader wishes to verify that these results are correct, we strongly recommend the
program 'Penny' from the free Phylip package [Felsenstein, 2005], which is able to run an
exhaustive search through the tree space for binary character sequences. □



AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
CCCCCCCCCCCCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCC 
CCCCCCCCCCCAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCAAAAAAAAAAAACAAAACCCCCC 
AACCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCAAAACCCCCC 
CCAAAAACCCCACCCCCCCCCAAAAAACCCCCCCCCCCAAAAAAAAACCCAACCCCCCCCCCCCAAAAAA 
CCCCCCCAAAAAAAAAAAAAACCCCCCAAAAAAAAAAACCCCCCCCCAAACCCCCCCCCACCCCAAAAAA 

Figure 5: Alignment $S$ as defined in the proof of Proposition 3.5 has a unique MP tree $T$
(cf. Figure 6), which has no MP subtrees.



Note that the example presented in the proof of Proposition 3.5 concerns tree $T =
(((1,2),3),(4,(5,6)))$; analogous examples can be constructed for the other tree shape on
six taxa, namely $((1,2),(3,4),(5,6))$ (example not shown). We conclude that in general,
MP trees for an alignment do not have to be related to MP trees on subalignments. This
surprising result shows once again that MP, while being a simple combinatorial algorithm,
is more complicated than one might intuitively think. As explained above, the existence
of such instances is not an immediate consequence of the NP-hardness of finding the set
of most parsimonious trees in the tree space. It rather adds another complicated aspect
to an already hard problem.

Figure 6: Tree $T = (((1,2),3),(4,(5,6)))$ is the unique MP tree for $S$ shown in Figure 5 but
has no most parsimonious subtrees.

Figure 7: Illustrations of all 5-taxa MP trees for the corresponding subalignments of
alignment $S$ as defined in the proof of Proposition 3.5. None of these trees is a subtree of
$T$ shown in Figure 6, which is the unique MP tree of $S$.

Figure 8: Illustrations of all 4-taxa MP trees for the corresponding subalignments of
alignment $S$ as defined in the proof of Proposition 3.5. None of these trees is a subtree of
$T$ shown in Figure 6, which is the unique MP tree of $S$.

3.2 Heredity Part II: Constructing large MP trees from smaller ones

In the previous section, we showed that Maximum Parsimony trees are not in general
hereditary in the sense of allowing for the inference of smaller MP trees from known larger
ones. In the present section, we approach a different aspect of heredity: Given an
alignment, is it possible to combine small compatible MP trees, e.g. quartets, to derive an
MP tree for the entire set of taxa? Intuitively, one might think that it is quite likely that
this is true, as the assumption of compatibility of the MP-quartets is a strong condition.
Moreover, one might think that if additionally the MP-quartets are all unique, which is
another strong condition, it is even more likely for such a statement to hold. However, in
this section we present a counterexample which shows that even under these seemingly
ideal conditions it may be impossible to infer large MP trees from the smaller ones.

Proposition 3.6. If for an alignment $S$ on the taxa set $X$ all most parsimonious quartet
trees (for taxa sets $\{x_1, x_2, x_3, x_4\} \subseteq X$) on the corresponding subalignments of $S$ are
compatible with an $X$-tree $T$, this tree $T$ does not need to be an MP tree for $S$. This is
even true if the MP-quartets and the MP tree for $S$ are all unique.

Proof. We prove the proposition by providing an explicit counterexample. Consider the
following binary alignment:

S :=
AAAAAAAAAAAAAAA
AACCCAAAAAAAAAA
CCAAACCCCCCAAAA
CCCCCCCCAAACCCC
CCCCCAAACCCCCCC

$S$ consists of two copies of $f_1$, three copies of $f_2$, three copies of $f_8$, three copies of $f_9$ and
four copies of $f_{10}$, where $f_1, \ldots, f_{10}$ are defined as in the proof of Theorem 3.3. Using
an exhaustive search through the space of all 15 trees on five taxa, which for binary
alignments is provided e.g. by the 'Penny' program of the Phylip package [Felsenstein,
2005], we find that $S$ has the unique MP tree $T = ((1,3),2,(4,5))$ depicted in Figure 10.
Moreover, subalignment $S - 1$ has the unique MP quartet tree $\tilde{T} - 1 := ((2,3),(4,5))$,
subalignment $S - 2$ has the unique MP quartet tree $\tilde{T} - 2 := ((1,3),(4,5))$, subalignment
$S - 3$ has the unique MP quartet tree $\tilde{T} - 3 := ((1,2),(4,5))$, subalignment $S - 4$ has the
unique MP quartet tree $\tilde{T} - 4 := ((1,2),(3,5))$ and subalignment $S - 5$ has the unique MP
quartet tree $\tilde{T} - 5 := ((1,2),(3,4))$. All these trees are depicted in Figure 9. Note that
trees $\tilde{T} - 1$, $\tilde{T} - 2$, $\tilde{T} - 3$, $\tilde{T} - 4$ and $\tilde{T} - 5$ are all compatible with tree $\tilde{T} := ((1,2),3,(4,5))$
depicted in Figure 10, whereas $\tilde{T} - 4$ and $\tilde{T} - 5$ are incompatible with $T$. So the unique
and compatible MP quartets cannot be combined to give the unique MP tree for the
whole alignment. This completes the proof. □
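The compatibility claims in this proof can be checked mechanically. The sketch below takes the five unique MP quartet trees as stated in the proof (it does not recompute them), enumerates all 15 five-taxon trees and keeps those that display every quartet. The encoding of a quartet as a set of two pairs and the function names are assumptions made for illustration.

```python
def quartet(pair_a, pair_b):
    """A quartet tree ab|cd, encoded as an unordered pair of unordered pairs."""
    return frozenset({frozenset(pair_a), frozenset(pair_b)})


# Unique MP quartet trees of S - k for k = 1..5, as stated in the proof.
MP_QUARTETS = {
    1: quartet((2, 3), (4, 5)),
    2: quartet((1, 3), (4, 5)),
    3: quartet((1, 2), (4, 5)),
    4: quartet((1, 2), (3, 5)),
    5: quartet((1, 2), (3, 4)),
}


def five_taxon_trees():
    """All 15 unrooted binary trees on taxa 1..5, written as ((a,b),c,(d,e))."""
    trees = []
    for middle in range(1, 6):
        a, b, c, d = sorted(set(range(1, 6)) - {middle})
        for left, right in [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]:
            trees.append((left, middle, right))
    return trees


def induced_quartet(tree, removed):
    """Quartet displayed by ((a,b),c,(d,e)) after deleting one taxon."""
    (a, b), c, (d, e) = tree
    if removed == c:
        return quartet({a, b}, {d, e})
    if removed in (a, b):
        return quartet({a, b, c} - {removed}, {d, e})
    return quartet({a, b}, {c, d, e} - {removed})


def displayed(tree):
    """Taxa k whose MP quartet on S - k is displayed by the tree."""
    return [k for k, q in MP_QUARTETS.items() if induced_quartet(tree, k) == q]


def compatible_trees():
    """Five-taxon trees displaying all five MP quartets."""
    return [t for t in five_taxon_trees() if len(displayed(t)) == 5]
```

Here `compatible_trees()` yields only `((1, 2), 3, (4, 5))`, whereas the MP tree `((1, 3), 2, (4, 5))` displays the quartets for $k = 1, 2, 3$ but not those for $k = 4, 5$.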
Figure 9: Illustration of the unique most parsimonious quartet trees of $S$ as defined in the
proof of Proposition 3.6.

Figure 10: Tree $T$ is the unique MP tree of $S$ as defined in the proof of Proposition 3.6, but
tree $\tilde{T} = ((1,2),3,(4,5))$ is the only tree that is compatible with all unique MP quartet trees.

While Section 3.1 shows that in general large MP trees cannot be used to infer smaller
MP trees on subsets of the taxon set, the above example shows that the opposite is also 
impossible, even under strong compatibility conditions. So MP is a phylogenetic tree 
inference method that may find that unique 'best' trees are unrelated to 'best' trees on 
subsets or supersets of the taxa under consideration. As this is somewhat counterintuitive, 
naturally the question arises whether this problem only occurs with MP or also affects 
other methods. In the next section, we will generalize our results to Maximum Likelihood 
(ML) under a frequently used nucleotide substitution model. 
3.3 Heredity Part III: Impacts of the parsimony results on Maximum Likelihood

In the following, we will examine the impacts of the results presented in Sections 3.1 and
3.2 on Maximum Likelihood (ML) under the so-called $N_r$-model introduced in [Neyman,
1971] and explained in Section 2. The $N_4$-model is also known as the Jukes-Cantor model
[Jukes and Cantor, 1969]. It may be biologically justified to assume that the substitution
probabilities on an edge of the tree are the same for each character in an alignment,
particularly if the characters are close to one another in the alignment. In this case, we
say these characters evolved under a common mechanism. If, on the contrary, different
mechanisms are allowed to operate at each site, we say there is no common mechanism.
For both cases, Tuffley and Steel presented results that closely link Maximum Likelihood
with Maximum Parsimony [Tuffley and Steel, 1997].



Theorem 3.7 (Equivalence of MP and ML under no common mechanism, Theorem 5 of
[Tuffley and Steel, 1997]). Maximum parsimony and maximum likelihood with no common
mechanism are equivalent in the sense that both choose the same tree or trees.

Theorem 3.8 (Agreement of ML with MP under the $N_r$-model, Theorem 7 of [Tuffley and
Steel, 1997]). For data containing enough constant characters, the maximum likelihood tree
under the $N_r$-model is a maximum parsimony tree.



The consequences of these theorems in combination with Sections 3.1 and 3.2 can be
described as follows: if no common mechanism is assumed, all examples provided in these
sections immediately lead to analogous results for Maximum Likelihood by Theorem 3.7.
If we assume a common mechanism, the examples provided in these sections may need
to be modified by adding constant characters. These extra characters do
not change the MP tree as constant characters are non-informative, but they will make
ML agree with MP according to Theorem 3.8. So in both cases, we derive alignments
for which the ML tree is not hereditary. The following corollaries are therefore direct
conclusions from the previous sections combined with Theorems 3.7 and 3.8.
Corollary 3.9. Let $S := f_1 f_2 \cdots f_n$ be a sequence of characters (alignment) on the set $X$
of taxa, where $|X| = m \ge 5$, and let $T$ be a Maximum Likelihood tree for $S$. Then, under
the $N_r$-model (with or without assuming a common mechanism), there may be a subset
$Y$ of $X$ of size at least four, such that $T|_Y$ is not an ML tree for $S|_Y$ (where $S|_Y$ is the
sequence $S$ of characters restricted to the taxa in $Y$ and $T|_Y$ is the tree $T$ restricted to
the taxa in $Y$). In fact, a unique ML tree might not have any (non-trivial) most likely
subtree at all.

Proof. This result follows directly by applying Theorems 3.7 and 3.8, respectively, to the
examples (if required, filled up with sufficiently many constant characters) constructed in
the proofs of Propositions 3.4 and 3.5. □

Corollary 3.10. If for an alignment $S$ on the taxa set $X$ all ML quartet trees (for taxa
sets $\{x_1, x_2, x_3, x_4\} \subseteq X$) on the corresponding subalignments of $S$ are compatible with
an $X$-tree $T$, this tree $T$ does not need to be an ML tree for $S$. This is even true if the
ML-quartets and the ML tree for $S$ are all unique.

Proof. This result follows directly by applying Theorems 3.7 and 3.8, respectively, to the
example (if required, filled up with sufficiently many constant characters) constructed in
the proof of Proposition 3.6. □

4 Discussion 

In this paper, we presented various examples of non-hereditary Maximum Parsimony and 
Maximum Likelihood trees together with an idea of how to construct them as solutions 
to systems of inequalities. The results show that there are alignments for which the 'best' 
tree with respect to one of these phylogenetic tree inference methods does not have to be 
related to the 'best' tree on fewer taxa. Also, even if a tree is constructed from uniquely 
'best' and compatible quartet trees, it might not coincide with the 'best' tree on all taxa. 
On the one hand, these facts might help to understand why tree reconstruction for MP
and ML is hard ([Foulds and Graham, 1982], [Roch, 2006], [Chor and Tuller, 2006]). On
the other hand, the result is surprising and gives rise to new questions, e.g. whether or
not one should include outgroups when inferring trees, as these might change the tree
topology of the optimal tree. Naturally, it would also be interesting to know whether
similar heredity problems occur with other methods, such as distance methods or
ML under more complicated models of nucleotide substitution. We conjecture that these
methods are also not hereditary in the sense described in this paper.